Time-Space Trade-Offs for Longest Common Extensions
Abstract
We revisit the longest common extension (LCE) problem, that is, preprocess a string T into a compact data structure that supports fast LCE queries. An LCE query takes a pair (i, j) of indices in T and returns the length of the longest common prefix of the suffixes of T starting at positions i and j. We study the time-space trade-offs for the problem, that is, the space used for the data structure vs. the worst-case time for answering an LCE query. Let n be the length of T. Given a parameter τ, 1 ≤ τ ≤ n, we show how to achieve either O(n/√τ) space and O(τ) query time, or O(n/τ) space and O(τ log(|LCE(i, j)|/τ)) query time, where |LCE(i, j)| denotes the length of the LCE returned by the query. These bounds provide the first smooth trade-offs for the LCE problem and almost match the previously known bounds at the extremes when τ = 1 or τ = n. We apply the result to obtain improved bounds for several applications where the LCE problem is the computational bottleneck, including approximate string matching and computing palindromes. We also present an efficient technique to reduce LCE queries on two strings to one string. Finally, we give a lower bound on the time-space product for LCE data structures in the non-uniform cell probe model showing that our second trade-off is nearly optimal.
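To make the query definition concrete, here is a minimal sketch of an LCE query answered by direct character comparison. It is not the data structure from the paper: it uses no preprocessing and no extra space, and its running time is proportional to the length of the answer. The function name `lce` and the 0-based indexing are assumptions made for this illustration.

```python
def lce(text: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of text[i:] and text[j:].

    Naive baseline (an illustrative assumption, not the paper's method):
    compare characters one by one until a mismatch or the end of the string.
    Positions are 0-based.
    """
    n = len(text)
    if not (0 <= i <= n and 0 <= j <= n):
        raise IndexError("positions must lie in [0, len(text)]")
    length = 0
    # Extend the match while both positions stay inside the string
    # and the characters at the two positions agree.
    while (
        i + length < n
        and j + length < n
        and text[i + length] == text[j + length]
    ):
        length += 1
    return length


if __name__ == "__main__":
    # The suffixes "abracadabra" and "abra" share the prefix "abra".
    print(lce("abracadabra", 0, 7))
```

A space-efficient structure of the kind the abstract describes would aim to answer the same query faster than this scan while storing far less than a linear-size index.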
Similar resources
Longest Common Extensions in Sublinear Space
The longest common extension problem (LCE problem) is to construct a data structure for an input string T of length n that supports LCE(i, j) queries. Such a query returns the length of the longest common prefix of the suffixes starting at positions i and j in T. This classic problem has a well-known solution that uses O(n) space and O(1) query time. In this paper we show that for any trade-of...
Time-Space Trade-Offs for the Longest Common Substring Problem
The Longest Common Substring problem is to compute the longest substring which occurs in at least d ≥ 2 of m strings of total length n. In this paper we ask the question whether this problem allows a deterministic time-space trade-off using O(n^{1+ε}) time and O(n^{1−ε}) space for 0 ≤ ε ≤ 1. We give a positive answer in the case of two strings (d = m = 2) and 0 < ε ≤ 1/3. In the general case where 2 ≤ d...
Ethical Perspective: Five Unacceptable Trade-offs on the Path to Universal Health Coverage
This article discusses what ethicists have called “unacceptable trade-offs” in health policy choices related to universal health coverage (UHC). Since the fiscal space is constrained, trade-offs need to be made. But some trade-offs are unacceptable on the path to universal coverage. Unacceptable choices include, among other examples from low-income countries, to expand coverage for services wit...
Sampled Longest Common Prefix Array
When augmented with the longest common prefix (LCP) array and some other structures, the suffix array can solve many string processing problems in optimal time and space. A compressed representation of the LCP array is also one of the main building blocks in many compressed suffix tree proposals. In this paper, we describe a new compressed LCP representation: the sampled LCP array. We show that...
Sublinear Space Algorithms for the Longest Common Substring Problem
Given m documents of total length n, we consider the problem of finding a longest string common to at least d ≥ 2 of the documents. This problem is known as the longest common substring (LCS) problem and has a classic O(n) space and O(n) time solution (Weiner [FOCS’73], Hui [CPM’92]). However, the use of linear space is impractical in many applications. In this paper we show that for any trade-...